
Conversation

@Ninja91
Contributor

@Ninja91 Ninja91 commented Aug 29, 2025

Stack from ghstack (oldest at bottom):

Add 16A8W quantization support and comprehensive tests for the add operation in ExecutorTorch ARM backend targeting Ethos U55 and U85 NPUs.

This follows the pattern established for linear operations, extending int16 support to add operations with hardware-specific testing.

Changes:

  • Add INT16 dtype validation support in op_add.py (see the sketch below)
  • Add test_add_tensor_16a8w_tosa_INT test function with U55/U85 pipeline support
  • Add U55 and U85 specific 16A8W tests with proper xfail decorators
  • Fix U55/U85 test parameter usage (remove unsupported tosa_extensions, clean quantizer function calls)
  • Update xfail reasons to consistent 'Vela compilation fails with Invalid arguments' pattern
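
For reviewers who want a concrete picture, the INT16 change boils down to accepting int16 tensors in the add operator's dtype check. A minimal sketch, assuming a standalone helper rather than the actual op_add.py visitor structure (the function name and exact dtype set below are illustrative, not the diff itself):

```python
# Illustrative only: the helper name and structure are assumptions, not the
# actual op_add.py change. The point is that int16 is now accepted alongside
# the previously supported dtypes.
import torch

_SUPPORTED_ADD_DTYPES = (torch.int8, torch.int16, torch.int32, torch.float32)


def validate_add_dtypes(*tensors: torch.Tensor) -> None:
    """Reject inputs whose dtype the TOSA add lowering does not support."""
    for t in tensors:
        if t.dtype not in _SUPPORTED_ADD_DTYPES:
            raise ValueError(
                f"add: unsupported dtype {t.dtype}; expected one of {_SUPPORTED_ADD_DTYPES}"
            )
```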

Differential Revision: D80510463

cc @digantdesai @freddan80 @per @zingo @oscarandersson8218

Add 16A8W quantization support and test for the add operation in ExecutorTorch ARM backend.

This follows the pattern established for linear operations, extending int16 support to add operations.

Changes:
- Add INT16 dtype validation support in op_add.py
- Add test_add_tensor_16a8w_tosa_INT test function
- Enable test_add.py in test targets configuration

The 16A8W configuration uses 16-bit activations with 8-bit weights, enabling higher precision for activations while maintaining weight efficiency.
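
As a rough illustration of that configuration, here is a minimal sketch of an activation/weight spec pair built on the PT2E quantizer primitives; the observer choices and value ranges are assumptions for illustration, not the ARM backend's actual a16w8 config:

```python
# Hypothetical 16A8W spec pair: 16-bit symmetric activations, 8-bit symmetric
# weights. Observer choices and quantization ranges are illustrative assumptions.
import torch
from torch.ao.quantization.observer import MinMaxObserver
from torch.ao.quantization.quantizer import QuantizationSpec

activation_spec_16bit = QuantizationSpec(
    dtype=torch.int16,
    quant_min=-(2**15),
    quant_max=2**15 - 1,
    qscheme=torch.per_tensor_symmetric,
    is_dynamic=False,
    observer_or_fake_quant_ctr=MinMaxObserver.with_args(eps=2**-12),
)

weight_spec_8bit = QuantizationSpec(
    dtype=torch.int8,
    quant_min=-127,
    quant_max=127,
    qscheme=torch.per_tensor_symmetric,
    is_dynamic=False,
    observer_or_fake_quant_ctr=MinMaxObserver.with_args(eps=2**-12),
)
```

Whether a given Ethos-U target accepts such specs is still up to the backend's own operator support checks; this only spells out the activation/weight asymmetry that the name "16A8W" refers to.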

Differential Revision: [D80510463](https://our.internmc.facebook.com/intern/diff/D80510463/)

[ghstack-poisoned]
@Ninja91 Ninja91 requested a review from digantdesai as a code owner August 29, 2025 04:20
@pytorch-bot

pytorch-bot bot commented Aug 29, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/13789

Note: Links to docs will display an error until the docs builds have been completed.

❌ 3 New Failures, 1 Cancelled Job

As of commit 3dbf93f with merge base 1a7441f:

NEW FAILURES - The following jobs have failed:

CANCELLED JOB - The following job was cancelled. Please retry:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

Ninja91 added a commit that referenced this pull request Aug 29, 2025
ghstack-source-id: 305897355
Pull Request resolved: #13789
@meta-cla meta-cla bot added the CLA Signed label Aug 29, 2025
@facebook-github-bot
Contributor

This pull request was exported from Phabricator. Differential Revision: D80510463

Ninja91 added a commit that referenced this pull request Aug 29, 2025
Pull Request resolved: #13789

Add 16A8W quantization support and comprehensive tests for the add operation in ExecutorTorch ARM backend targeting Ethos U55 and U85 NPUs.

This follows the pattern established for linear operations, extending int16 support to add operations with hardware-specific testing.

Changes:
- Add INT16 dtype validation support in op_add.py
- Add test_add_tensor_16a8w_tosa_INT test function with U55/U85 pipeline support
- Add U55 and U85 specific 16A8W tests with proper xfail decorators (see the test sketch below)
- Fix U55/U85 test parameter usage (remove unsupported tosa_extensions, clean quantizer function calls)
- Update xfail reasons to consistent 'Vela compilation fails with Invalid arguments' pattern
- Remove redundant u55_config parameter from get_symmetric_a16w8_add_quantizer function
- Enable test_add.py in test targets configuration for both fbcode and xplat

The 16A8W configuration uses 16-bit activations with 8-bit weights, enabling higher precision for activations while maintaining weight efficiency on ARM Ethos NPUs.
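
For context, a hardware-specific 16A8W test of this kind looks roughly like the sketch below. The pipeline helper, its arguments, and the target strings are hypothetical placeholders standing in for the Arm test utilities; only the xfail reason string and the a16w8 intent come from this change.

```python
# Simplified sketch of a U55/U85 16A8W add test; run_ethos_u_pipeline is a
# hypothetical placeholder for the Arm backend test pipeline.
import pytest
import torch


class Add(torch.nn.Module):
    def forward(self, x, y):
        return x + y


def run_ethos_u_pipeline(module, inputs, target, a16w8):
    """Placeholder: the real pipeline quantizes (16-bit activations, 8-bit
    weights), lowers through the Arm backend for `target`, compiles with Vela,
    and compares against the eager-mode output."""
    raise NotImplementedError


@pytest.mark.parametrize("target", ["ethos-u55-128", "ethos-u85-128"])
@pytest.mark.xfail(reason="Vela compilation fails with Invalid arguments")
def test_add_tensor_16a8w_ethos_u(target):
    inputs = (torch.randn(1, 8), torch.randn(1, 8))
    run_ethos_u_pipeline(Add(), inputs, target=target, a16w8=True)
```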
ghstack-source-id: 306430209
@exported-using-ghexport

Differential Revision: [D80510463](https://our.internmc.facebook.com/intern/diff/D80510463/)
Ninja91 added a commit that referenced this pull request Aug 29, 2025
Pull Request resolved: #13789

ghstack-source-id: 306430209
@exported-using-ghexport

Differential Revision: [D80510463](https://our.internmc.facebook.com/intern/diff/D80510463/)
Ninja91 added a commit that referenced this pull request Aug 29, 2025
Pull Request resolved: #13789

Add 16A8W quantization support and comprehensive tests for the add operation in ExecutorTorch ARM backend targeting Ethos U55 and U85 NPUs.

This follows the pattern established for linear operations, extending int16 support to add operations with hardware-specific testing.

Changes:
- Add INT16 dtype validation support in op_add.py
- Add test_add_tensor_16a8w_tosa_INT test function with U55/U85 pipeline support
- Add U55 and U85 specific 16A8W tests with proper xfail decorators
- Fix U55/U85 test parameter usage (remove unsupported tosa_extensions, clean quantizer function calls)
- Update xfail reasons to consistent 'Vela compilation fails with Invalid arguments' pattern

ghstack-source-id: 306430970
ghstack-source-id: 306430970
@exported-using-ghexport

Differential Revision: [D80510463](https://our.internmc.facebook.com/intern/diff/D80510463/)

@Ninja91
Contributor Author

Ninja91 commented Aug 29, 2025

Closed #13653 as it's covered in this PR.

Ninja91 added a commit that referenced this pull request Sep 4, 2025
Pull Request resolved: #13789

ghstack-source-id: 306434516
@exported-using-ghexport

Differential Revision: [D80510463](https://our.internmc.facebook.com/intern/diff/D80510463/)
Ninja91 added a commit that referenced this pull request Sep 4, 2025
Pull Request resolved: #13789

ghstack-source-id: 307540287
@exported-using-ghexport

Differential Revision: [D80510463](https://our.internmc.facebook.com/intern/diff/D80510463/)
@Ninja91 Ninja91 added the release notes: arm label Sep 5, 2025


Ninja91 added a commit that referenced this pull request Sep 6, 2025
Pull Request resolved: #13789

ghstack-source-id: 308024224
@exported-using-ghexport

Differential Revision: [D80510463](https://our.internmc.facebook.com/intern/diff/D80510463/)
Ninja91 added a commit that referenced this pull request Sep 7, 2025
Pull Request resolved: #13789

ghstack-source-id: 308046738
@exported-using-ghexport
@bypass-github-pytorch-ci-checks

Differential Revision: [D80510463](https://our.internmc.facebook.com/intern/diff/D80510463/)
Ninja91 added a commit that referenced this pull request Sep 7, 2025
Pull Request resolved: #13789

ghstack-source-id: 308052889
@exported-using-ghexport
@bypass-github-pytorch-ci-checks

Differential Revision: [D80510463](https://our.internmc.facebook.com/intern/diff/D80510463/)
Ninja91 added a commit that referenced this pull request Sep 7, 2025
Pull Request resolved: #13789

ghstack-source-id: 308053642
@exported-using-ghexport
@bypass-github-pytorch-ci-checks
@bypass-github-executorch-ci-checks

Differential Revision: [D80510463](https://our.internmc.facebook.com/intern/diff/D80510463/)
@facebook-github-bot facebook-github-bot merged commit 66e38a9 into gh/Ninja91/5/base Sep 7, 2025
290 of 294 checks passed
@facebook-github-bot facebook-github-bot deleted the gh/Ninja91/5/head branch September 7, 2025 06:41
Ninja91 added a commit that referenced this pull request Sep 8, 2025
This PR was created by the merge bot to help merge the original PR into
the main branch.
ghstack PR number: #13789 by
@Ninja91
^ Please use this as the source of truth for the PR details, comments,
and reviews
ghstack PR base:
https://github.com/pytorch/executorch/tree/gh/Ninja91/5/base
ghstack PR head:
https://github.com/pytorch/executorch/tree/gh/Ninja91/5/head
Merge bot PR base: https://github.com/pytorch/executorch/tree/main
Merge bot PR head:
https://github.com/pytorch/executorch/tree/gh/Ninja91/5/orig
@diff-train-skip-merge

Co-authored-by: Nitin Jain <[email protected]>

Labels

ciflow/trunk · CLA Signed · fb-exported · module: arm · partner: arm · release notes: arm
